Sample-Optimal Low-Rank Approximation for PSD Matrices
May 6, 2020 (Zoom)

Abstract

Recently, Musco and Woodruff (FOCS, 2017) showed that given an n \times n positive semidefinite (PSD) matrix \mathbb{A}, it is possible to compute a (1+\epsilon)-approximate relative-error low-rank approximation to \mathbb{A} by querying \widetilde{O}(nk/\epsilon^{2.5}) entries of \mathbb{A} in time \widetilde{O}(nk/\epsilon^{2.5} +n k^{\omega-1}/\epsilon^{2(\omega-1)}). They also showed that any relative-error low-rank approximation algorithm must query \widetilde{\Omega}(nk/\epsilon) entries of \mathbb{A}, this gap has since remained open. Our main result is to resolve this question by obtaining an optimal algorithm that queries \widetilde{O}(nk/\epsilon) entries of \mathbb{A} and outputs a relative-error low-rank approximation in \widetilde{O}(n\cdot(k/\epsilon)^{\omega-1}) time. Note, our running time also improves that of Musco and Woodruff, and matches the information-theoretic lower bound if the matrix-multiplication exponent \omega is 2.

Our techniques extend to obtain sample-optimal algorithms for computing a low-rank approximation of Negative-type distance matrices and corsets for ridge regression.

This talk is based on joint work with Nadiia Chepurko and David Woodruff (https://arxiv.org/abs/1912.04177).